661 research outputs found

Basis Expansions for Functional Snippets

Authors: Lin Zhenhua; Wang Jane-Ling; Zhong Qixian
Publication date: 29/08/2020

Estimation of mean and covariance functions is fundamental for functional data analysis. While this topic has been studied extensively in the literature, a key assumption is that there are enough data in the domain of interest to estimate both the mean and covariance functions. In this paper, we investigate mean and covariance estimation for functional snippets, in which observations from a subject are available only in an interval of length strictly (and often much) shorter than the length of the whole interval of interest. For such a sampling plan, no data are available for direct estimation of the off-diagonal region of the covariance function. We tackle this challenge via a basis representation of the covariance function. The proposed approach allows one to consistently estimate an infinite-rank covariance function from functional snippets. We establish the convergence rates for the proposed estimators and illustrate their finite-sample performance via simulation studies and two data applications.

Comment: 51 pages, 10 figures
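The sampling constraint described in this abstract can be made concrete with a small simulation. The following is a minimal sketch, not the authors' method: the domain [0, 1], the snippet length of 0.25, and the function names `simulate_snippet_design` and `count_off_diagonal_pairs` are all assumptions chosen for illustration. It shows that no subject contributes a pair of observation times farther apart than the snippet length, so the corresponding off-diagonal region of the covariance surface receives no direct data.

```python
import random


def simulate_snippet_design(n_subjects=200, snippet_len=0.25,
                            points_per_subject=5, seed=0):
    """Draw observation times on [0, 1]; each subject is seen only
    inside a randomly placed window of length `snippet_len`.
    (Hypothetical design, chosen for illustration only.)"""
    rng = random.Random(seed)
    designs = []
    for _ in range(n_subjects):
        start = rng.uniform(0.0, 1.0 - snippet_len)
        times = sorted(rng.uniform(start, start + snippet_len)
                       for _ in range(points_per_subject))
        designs.append(times)
    return designs


def count_off_diagonal_pairs(designs, snippet_len, tol=1e-12):
    """Count within-subject pairs (s, t) with |s - t| > snippet_len.
    Only such pairs could inform the far off-diagonal covariance."""
    count = 0
    for times in designs:
        for i in range(len(times)):
            for j in range(i + 1, len(times)):
                if times[j] - times[i] > snippet_len + tol:
                    count += 1
    return count


designs = simulate_snippet_design()
largest_gap = max(times[-1] - times[0] for times in designs)
far_pairs = count_off_diagonal_pairs(designs, snippet_len=0.25)
print(f"largest within-subject gap: {largest_gap:.3f}")
print(f"pairs with |s - t| > 0.25: {far_pairs}")
```

Under this assumed design the second count is zero regardless of the number of subjects, which is why a direct (for example, local smoothing) estimate of the covariance is unavailable away from the diagonal band and some structural device, such as the basis representation the abstract mentions, is needed.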

arXiv.org e-Print Archive

ScholarBank@NUS

Logistic Regression and Classification with non-Euclidean Covariates

Authors: Lin Yinan; Lin Zhenhua
Publication date: 08/10/2023

We introduce a logistic regression model for data pairs consisting of a binary response and a covariate residing in a non-Euclidean metric space without vector structures. Based on the proposed model, we also develop a binary classifier for non-Euclidean objects. We propose a maximum likelihood estimator for the non-Euclidean regression coefficient in the model, and provide upper bounds on the estimation error under various metric entropy conditions that quantify the complexity of the underlying metric space. Matching lower bounds are derived for the important metric spaces commonly seen in statistics, establishing optimality of the proposed estimator in such spaces. Similarly, an upper bound on the excess risk of the developed classifier is provided for general metric spaces. A finer upper bound and a matching lower bound, and thus optimality of the proposed classifier, are established for Riemannian manifolds. We investigate the numerical performance of the proposed estimator and classifier via simulation studies, and illustrate their practical merits via an application to task-related fMRI data.

Comment: This revision contains the following updates: (1) The parameter space is allowed to be unbounded; (2) Some upper bounds are tightened

arXiv.org e-Print Archive

Online Algorithms for Geographical Load Balancing

Authors: Andrew Lachlan L. H.; Lin Minghong; Liu Zhenhua; Wierman Adam
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

It has recently been proposed that Internet energy costs, both monetary and environmental, can be reduced by exploiting temporal variations and shifting processing to data centers located in regions where energy currently has low cost. Lightly loaded data centers can then turn off surplus servers. This paper studies online algorithms for determining the number of servers to leave on in each data center, and then uses these algorithms to study the environmental potential of geographical load balancing (GLB). A commonly suggested algorithm for this setting is “receding horizon control” (RHC), which computes the provisioning for the current time by optimizing over a window of predicted future loads. We show that RHC performs well in a homogeneous setting, in which all servers can serve all jobs equally well; however, we also prove that differences in propagation delays, servers, and electricity prices can cause RHC to perform badly. We therefore introduce variants of RHC that are guaranteed to perform as well in the face of such heterogeneity. These algorithms are then used to study the feasibility of powering a continent-wide set of data centers mostly by renewable sources, and to understand what portfolio of renewable energy is most effective.
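The receding horizon idea described in this abstract (optimize over a window of predicted loads, commit only to the current decision, then slide the window) can be sketched for a toy case. This is an illustration under stated assumptions, not the paper's model: a single data center, perfect load predictions, integer server counts, a per-slot energy cost per active server, and a cost for each server switched on. The names `plan_first_action`, `receding_horizon_control`, and `schedule_cost` are hypothetical.

```python
def plan_first_action(prev_on, window_loads, energy_cost, switch_cost,
                      max_servers):
    """Minimize cost over the window by backward dynamic programming
    and return only the first slot's server count."""
    cost_to_go = [0.0] * (max_servers + 1)  # cost after the window ends
    choice = [0] * (max_servers + 1)
    for step in range(len(window_loads) - 1, -1, -1):
        new_cost = [0.0] * (max_servers + 1)
        choice = [0] * (max_servers + 1)
        for prev in range(max_servers + 1):
            best_cost, best_x = float("inf"), None
            # Must keep at least `window_loads[step]` servers on.
            for x in range(window_loads[step], max_servers + 1):
                cost = (energy_cost * x
                        + switch_cost * max(0, x - prev)
                        + cost_to_go[x])
                if cost < best_cost:  # ties go to fewer servers
                    best_cost, best_x = cost, x
            new_cost[prev], choice[prev] = best_cost, best_x
        cost_to_go = new_cost
    return choice[prev_on]


def receding_horizon_control(loads, window, energy_cost, switch_cost):
    """Apply the first action of each window plan, then slide forward.
    Assumes predictions equal the true future loads."""
    max_servers = max(loads)
    schedule, prev_on = [], 0
    for t in range(len(loads)):
        x = plan_first_action(prev_on, loads[t:t + window],
                              energy_cost, switch_cost, max_servers)
        schedule.append(x)
        prev_on = x
    return schedule


def schedule_cost(schedule, energy_cost, switch_cost, initial_on=0):
    """Total energy plus switch-on cost of a provisioning schedule."""
    total, prev = 0.0, initial_on
    for x in schedule:
        total += energy_cost * x + switch_cost * max(0, x - prev)
        prev = x
    return total


loads = [2, 0, 2]  # servers needed in each time slot (toy data)
myopic = receding_horizon_control(loads, window=1,
                                  energy_cost=1.0, switch_cost=3.0)
lookahead = receding_horizon_control(loads, window=2,
                                     energy_cost=1.0, switch_cost=3.0)
print(myopic, schedule_cost(myopic, 1.0, 3.0))
print(lookahead, schedule_cost(lookahead, 1.0, 3.0))
```

In this toy instance a window of one slot turns the servers off during the idle slot and pays to switch them back on, while a window of two slots sees the returning load and keeps them running at lower total cost. The heterogeneous, multi-data-center setting in which the abstract reports that RHC can perform badly is not modeled here.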

CiteSeerX

Crossref

Caltech Authors

Swinburne Research Bank